Words Stemming Based on Structural and Semantic Similarity
Authors
Abstract
Similar resources
Stemming of French Words Based on Grammatical Categories
Automatic indexing systems use suffix stripping algorithms to cluster various words derived from a common root under the same stem. Currently, removing affixes is either a context-free or context-sensitive operation, where the context refers to the remaining stem. In this article, we propose a suffixing algorithm which uses grammatical categories to enhance the stemming process. This approach s...
Combining Similarity-based Approaches for Semantic Tagging of Unknown Words
This paper presents a method for semantic classification of unknown words that are not described in the Japanese thesaurus dictionary Bunrui-Goi-Hyo (BGH). The method combines three approaches based on similarity measures: (1) edit distance, (2) k-th Reciprocal Nearest Neighbors (RNN), and (3) semi-supervised clustering, each of which addresses a different aspect of unknown words. The final o...
Viewpoint-Based Measurement of Semantic Similarity between Words
A method of measuring semantic similarity between words using a knowledge-base constructed automatically from machine-readable dictionaries is proposed. The method takes into consideration the fact that similarity changes depending on the situation or context, which we call 'view-point'. Evaluation shows the proposed method, although based on a simply structured knowledge-base, is superior to other c...
A Laplacian Eigenmaps Based Semantic Similarity Measure between Words
The measurement of semantic similarity between words is very important in many applications. In this paper, we propose a method based on Laplacian eigenmaps to measure semantic similarity between words. First, we attach semantic features to each word. Second, a similarity matrix, into which the semantic features are encoded, is calculated in the original high-dimensional space. Finally, with the ai...
Text Segmentation Based on Similarity between Words
This paper proposes a new indicator of text structure, called the lexical cohesion profile (LCP), which locates segment boundaries in a text. A text segment is a coherent scene; the words in a segment are linked together via lexical cohesion relations. LCP records mutual similarity of words in a sequence of text. The similarity of words, which represents their cohesiveness, is computed using a s...
Journal
Journal title: Computer Engineering and Applications Journal
Year: 2014
ISSN: 2252-5459,2252-4274
DOI: 10.18495/comengapp.v3i2.57